(57) Abstract

The invention relates to an adaptive microphone arrangement with
one or more microphones (MP1, ..., MPn) comprising a signal
detecting arrangement for detecting target input signals, a signal
forming arrangement and signal storing means. The input signals
comprise a calibration signal (m1, ..., mn) and a second noise
signal (N1, ..., Nn), wherein the calibration input signal is
recorded and stored in a storing means (2). The signal forming
arrangement comprises a first signal forming means (4) and a
second signal forming means (5), wherein the first signal forming
means (4) comprises adapting means for treating the sum of the
calibration signal and a noise signal to provide filtering
coefficients, which are then copied to and used in the second
signal forming means (5) on the target-noise input signal, and
wherein the adapting signals and the target-noise signals are
input under essentially the same conditions.

ADAPTIVE MICROPHONE ARRANGEMENT AND METHOD FOR ADAPTING TO AN 
INCOMING TARGET-NOISE SIGNAL 

FIELD OF THE INVENTION 

The present invention relates to an adaptive microphone
arrangement as referred to in the first part of claim 1. The
invention furthermore relates to a method for adapting to an
incoming target signal.

The conditions under which a microphone arrangement is to be used
vary to a great extent. Sometimes the environment is very noisy,
as for example in a car or any other moving vehicle, and also in
workshops, storehouses etc. When so-called hands-free operation is
applied, the requirements on the microphone arrangement are even
more demanding, among other things due to the distance from the
source of the speech, or whatever it may be, to the microphones.
For example, the noisy environment in a car severely degrades the
performance of so-called hands-free mobile telephones and speech
recognition devices.

STATE OF THE ART 

A number of attempts have been made to improve the quality of
e.g. speech signals in noisy environments. These solutions are
generally based on spectral subtraction, Wiener filtering and
array techniques. Thus, through the use of speech generating
models and various algorithms, more and more a priori information
about speech signals is used in speech recognition and speech
coding arrangements. In order to ensure that the a priori
information is correct, an acceptable signal-to-noise ratio is
required, which in turn implies a need for noise reduction under
various adverse conditions such as hands-free operation of
telephones or speech recognition in cars. With the above-mentioned
solutions, for example methods or arrangements based on special
filtering adaptive microphone arrays, the signal-to-noise ratio
has been increased. However, in all these known applications the
near field considerations are of great importance. This in turn
leads to problems which are hard to describe in a detailed manner
with an a priori model. For example, a beamformer with inherent
theoretical modelling representing the acoustic field in a car
will with considerable probability provide an array which is based
on a priori information that is partly incorrect. In known
arrangements (such as e.g. "Methods for noise reduction applied to
speech input systems" by K. Krochel in Proc. VLSI and Computer
Peripherals, VLSI and Microelectronic Applications in Intelligent
Peripherals and their Interconnection Networks, 8-12 May 1989,
p. 2/82-87), automatic modulation is used to provide the wanted
signals. This does however not work if the a priori information is
wrong or incorrect, and it furthermore requires advanced
mathematical models etc. Moreover, in a number of known
arrangements or methods so-called noise cancellers are used. They
are built on reduction principles etc. which are complicated, and
generally they do not give a very good result but lead to
complicated arrangements.

SUMMARY OF THE INVENTION

It is an object of the present invention to provide an adaptive
microphone arrangement as initially referred to which gives a good
signal-to-noise ratio. It is furthermore an object of the
invention to provide an arrangement wherein no mathematical
modelling is required of signal statistics, the acoustic field,
the microphone array geometry, the characteristics of the
electronic equipment etc. A further object of the invention is to
provide an arrangement which can be used for so-called hands-free
operation in cars etc. and which provides good output signals,
e.g. a signal having a good signal-to-noise ratio. Still another
object of the invention is to provide a microphone arrangement
which is insensitive to channel mismatch.

Still another object of the invention is to provide an arrangement
which is robust and easy to use. A further object of the invention
is to provide an arrangement wherein a calibration signal or a
reference signal is easily obtained and which is, to the greatest
extent possible, correct. It is also an object of the invention to
provide an arrangement wherein, when used with speech recognition
devices, the discrepancy between the filtering functions during
speech recognition training and operation is small.

A further object of the invention is to provide a method for 
adapting to an incoming target signal. 

These as well as other objects are achieved through an arrangement
having the characteristics of the characterizing part of claim 1.

The objects are moreover achieved through a method having the 
characteristics of claim 20. 

A number of advantageous embodiments are given by the appended
subclaims. Advantageously the signal forming arrangement comprises
an adaptive beamformer and a filtering beamformer. In a
particularly advantageous embodiment the calibration signal is a
speech signal, or even more particularly a typical speech signal
or a signal with a speech-influenced spectrum. Most particularly
the calibration signal is recorded on site, i.e. it is recorded
using the same equipment and, in an advantageous embodiment, at
the same location as when the target-noise signal is produced.
Preferably the storage comprises a digital storage, or even more
particularly one digital storage for each input calibration
signal, each for a separate microphone. The calibration signal may
comprise a number of (secondary) calibration signals, i.e.
calibration signals from each microphone, which are combined into
a so-called desired signal. For adaptation to the sum of the
calibration signal and the noise signal, which according to an
advantageous embodiment comprises pure noise, the adapting means
uses an adaptive algorithm, which may e.g. be the so-called LMS
(Least Mean Square) algorithm, the RLS (Recursive Least Square)
algorithm or any other appropriate algorithm. In particular,
either one of the calibration signals or a combination of two or
more thereof is often used as a so-called desired signal in the
algorithm means, with which the sum of the calibration signal and
the noise signal is compared in a manner known per se. During
adaptation, during which no target signal or speech is provided, a
number of filtering coefficients are obtained in the adaptive
beamformer in a manner known per se. The filtering coefficients
are copied to and used in the second beamformer, i.e. the
filtering beamformer. When a target (target-noise) signal is
input, i.e. when a speaker or similar is active, the adaptation of
the adaptive beamformer is switched off and no adaptation takes
place. The target signal, e.g. the speech signal, is then filtered
through the filtering beamformer. Generally the first and second
beamformers comprise filters such as e.g. FIR filters (Finite
Impulse Response), the adaptation coefficients thereof being
optimized adaptively to the actual noise level or noise situation
and to the equipment "on site".



BRIEF DESCRIPTION OF THE DRAWINGS 



The invention will in the following be further described in a
non-limiting way with reference to the accompanying drawings,
wherein:

Fig. 1 illustrates a calibration phase and
Fig. 2 illustrates an adaptive filtering phase.






DETAILED DESCRIPTION OF THE INVENTION 



In the following an embodiment will be described wherein an array
of microphones is arranged, for example, in a car. In the Figs. an
array comprising n microphones (MP1, MP2, ..., MPn) is
illustrated, wherein n can be any number from one upwards and is
chosen depending on the actual circumstances and the relevant
environment. Thus there may be one or more microphones. In one
particular embodiment 8 microphones are used, but this of course
merely constitutes an example. The microphones may be of any
appropriate quality or of any kind. If however they are of a
standard quality, they generally have a considerable spread in
performance, which in turn poses high demands on the beamformer to
easily incorporate a calibration step. According to the invention,
training sequences are recorded from different positions in the
environment of e.g. a true speaker position, in a real situation
with the actual system and with no noise present. The training
sequences, or calibration signals, are then gathered into a
storage and later used as so-called training signals in the
adaptive phase. An inherent calibration signal is thereby
obtained, and it is generally possible to weight interesting
frequency bands and spatial points. The arrangement according to
the invention is accurate for the actual situation and does not
depend on the geometry of the array of microphones, on
similarities between elements, or on calibration or matching of
amplifiers or other electronic equipment etc. The microphone
arrangement generally uses two sets of input data, namely the
target-noise signals in the filtering beamformer and the recorded
calibration signals plus the noise signals in the adaptive
beamformer. In the first and second beamformers, i.e. the adaptive
beamformer and the filtering beamformer respectively, the signals
are filtered with so-called FIR filters (Finite Impulse Response
filters), i.e. a so-called tapped delay line, which carries out a
linear combination of input data.
In the described embodiment, the microphone arrangement may
particularly be used for so-called hands-free operation.
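
The tapped delay line mentioned above can be illustrated with a
minimal sketch. It is not taken from the described embodiment: the
function names fir_filter and beamformer_output are hypothetical,
the delay line is assumed to start empty (all zeros), and the
beamformer output is assumed to be the plain sum of the filtered
microphone channels.

```python
def fir_filter(samples, coefficients):
    """Tapped delay line: each output sample is a linear combination of
    the current and the previous len(coefficients) - 1 input samples."""
    delay_line = [0.0] * len(coefficients)  # assumed to start empty
    output = []
    for x in samples:
        # Shift the newest sample into the front of the delay line.
        delay_line = [x] + delay_line[:-1]
        output.append(sum(c * d for c, d in zip(coefficients, delay_line)))
    return output


def beamformer_output(channels, channel_coefficients):
    """Sum of the per-microphone FIR filter outputs (assumed combination)."""
    filtered = [fir_filter(ch, w) for ch, w in zip(channels, channel_coefficients)]
    return [sum(values) for values in zip(*filtered)]
```

With the coefficients [0.5, 0.5], for instance, each output sample
is the mean of the current and the previous input sample.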

The microphone arrangement according to Fig. 1 comprises a number
of microphones MP1, MP2, ..., MPn, wherein the microphones are
arranged and placed in any desired manner. The input calibration
signals M1, ..., Mn undergo an anti-aliasing operation and an A/D
conversion in a conversion block 1, whereafter the signals, now
designated m1, m2, ..., mn, are recorded in a calibration signal
storage 2. The calibration signals m1, ..., mn are also used in
the adaptive means, as will be further described later on. The
calibration signal is to be provided as a pure calibration signal,
i.e. in the case of a car or similar there should generally be no
noise upon its generation and recording, i.e. the car should be
parked with the motor off etc. Then a typical speech signal, or a
signal with a speech-influenced spectrum, from the typical speaker
position is recorded in the calibration signal storage 2. This is
preferably a digital storage or more particularly a number of
digital storages, one for each microphone channel. These recorded
signals form the calibration signals m1, ..., mn. The adaptive
means, i.e. the adaptive beamformer 4, can advantageously be
calibrated on site in a car or similar, e.g. by using a
loudspeaker or by letting the speaker read a representative
sequence. The sequences received in each microphone channel are
gathered into the calibration signal storage 2. This means that
the channels from the speaker, the loudspeaker or similar to the
A/D converters are included. As already mentioned above, the
environmental noise level should be as low as possible in order to
obtain a good signal-to-noise ratio in the desired signal, which
may be either one of the input calibration signals m1, ..., mn or
a combination of two or more of the calibration signals m1, m2,
..., mn. In a preferred embodiment the situation as well as the
equipment is generally time-invariant, whereby the microphone
arrangement has been provided with calibration signals which can
be combined to form the desired signals as referred to above.
Further, as already discussed above, the separate microphones MP1,
MP2, ..., MPn and their placement can be chosen in any appropriate
manner. According to a preferred embodiment, in order to obtain a
robust system, the speaker position or the loudspeaker position is
changed, i.e. moved around in the vicinity of the speaker's normal
position, during the recording of the calibration signal into the
storage. The recorded calibration signals from different positions
are, according to a preferred embodiment, superimposed to provide
weighted average training signals, calibration signals or
reference signals. As already referred to above, these signals are
gathered into the storage 2. As can be seen from Fig. 2, the
signals m1, m2, ..., mn, which form so-called calibration signals
or reference signals, are then used both as training signals and,
e.g. in a combined form, as a desired signal or reference signal
for use during adaptation.
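
The superposition of recordings from different speaker positions
and the forming of a combined desired signal can be sketched as
follows. This is only an illustration under stated assumptions:
the names weighted_average and desired_signal are hypothetical,
all recordings are assumed to be equally long and time-aligned,
and a plain mean over the microphone channels is merely one
possible way of combining the calibration signals.

```python
def weighted_average(recordings, weights):
    """Superimpose recordings of one microphone channel, taken from
    different speaker positions, into one weighted-average signal."""
    total = sum(weights)
    return [
        sum(w * rec[k] for w, rec in zip(weights, recordings)) / total
        for k in range(len(recordings[0]))
    ]


def desired_signal(calibration_channels):
    """Combine the per-microphone calibration signals into one desired
    signal; the plain mean used here is an assumed combination."""
    n = len(calibration_channels)
    return [sum(values) / n for values in zip(*calibration_channels)]
```

A larger weight for a given position makes the arrangement favour
that spatial point in the resulting training signal.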

After the calibration phase, wherein the calibration signals are
recorded and stored in the storage, an adaptive phase follows.
During this phase there is no calibration input signal. The
situation is generally a noisy one; in the case of the car it may
relate to a situation wherein the speaker is silent and the car is
moving, i.e. the motor is running etc. The input signals to the
adapting beamformer 4 are formed by the sum of the calibration
signals m1, m2, ..., mn stored in the storage 2 and the noise
signals N1, N2, ..., Nn respectively. Thus the speaker or similar
is silent. The storage also comprises an arrangement (not shown)
wherein e.g. a combined desired signal mr is formed. This might
also be formed by one of the input calibration signals m1, ..., mn
or a combination of just some of them. The stored speech signals
m1, ..., mn plus the noise N1, ..., Nn are thus introduced to the
adaptive beamformer 4. A known reference signal, or desired
signal, mr, which has passed through the same electronic equipment
when no noise was present, is also obtained. The adaptive filters
of the adapting beamformer 4 are thereby provided with all the
information that is needed to adapt to the correct filter
coefficients, e.g. in the least squares sense or by applying the
LMS algorithm (or any other appropriate algorithm). In a manner
known per se, the reference signal or desired signal mr is
subtracted from the output signal mbl of the adaptive beamformer 4
and the difference ε is formed etc. Thus the signals from a
"typical" speaker and the real speaker originate from the same
acoustical environment and meet the same electronic equipment etc.
Therefore the adaptive microphone arrangement will be calibrated
"on site" to the prevailing acoustic environment and to the
placement of the microphones etc., as well as to the individual
properties of the microphones, amplifiers, A/D converters and so
on.
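
The adaptation described above can be sketched with a standard
multichannel LMS update. This is an illustrative sketch, not the
embodiment itself: the name lms_adapt, the number of taps and the
step size mu are assumptions, each input channel is assumed to
already hold the stored calibration signal plus the current noise,
and the error is formed as beamformer output minus desired signal.

```python
def lms_adapt(inputs, desired, taps=4, mu=0.05):
    """inputs: one list per microphone (calibration signal + noise).
    desired: the reference signal. Returns (weights, errors)."""
    weights = [[0.0] * taps for _ in inputs]
    delay = [[0.0] * taps for _ in inputs]  # one tapped delay line per channel
    errors = []
    for k, d in enumerate(desired):
        for ch, x in enumerate(inputs):
            delay[ch] = [x[k]] + delay[ch][:-1]
        # Beamformer output: sum of all FIR filter outputs.
        y = sum(w * s for wc, dc in zip(weights, delay) for w, s in zip(wc, dc))
        e = y - d  # output minus desired signal
        # Gradient step that reduces the squared error.
        for ch in range(len(inputs)):
            weights[ch] = [w - mu * e * s for w, s in zip(weights[ch], delay[ch])]
        errors.append(e)
    return weights, errors
```

The returned weights correspond to the filter coefficients that
would subsequently be handed over to the filtering beamformer; mu
has to be small enough for the update to remain stable.
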
When the coefficients of the digital filters of the adaptive
beamformer 4 have been optimized adaptively to the current noise
situation and to the actual equipment, they are copied to the
second beamformer, i.e. the filtering beamformer 5. The filtering
beamformer 5 operates when the speaker or similar is active. When
the speaker (or similar) is active, the adaptation is switched
off, either automatically or manually, e.g. by a "push-to-talk"
function. This relates to a preferred embodiment; it is however
not necessary. If the adaptation is switched off, this is done to
avoid echo effects and/or to provide a more robust system, so that
the adaptive filters cannot operate on the real speech signal. The
target signal or speech signal, comprising speech plus noise, sn1,
sn2, ..., snn, is merely filtered through the filtering beamformer
5. During the filtering in the filtering beamformer 5, the
filtering coefficients are fixed and the output signal is obtained
from the filtering beamformer 5. According to a preferred
embodiment, as soon as the speaker stops speaking, the adaptation
in the adapting beamformer is continued. The filtering beamformer
preferably works continuously and without any calibration signal.
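
The interplay of the two beamformers can be sketched as follows.
The class TwoBeamformerArrangement, its method names and the
speaker_active flag (standing in for e.g. a push-to-talk button)
are hypothetical and only illustrate the principle that newly
adapted coefficients are copied to the filtering beamformer while
the speaker is silent and stay fixed while the speaker is active.

```python
import copy


def fir(samples, coefficients):
    """FIR filter with a zero initial state."""
    out = []
    for k in range(len(samples)):
        out.append(sum(c * samples[k - i]
                       for i, c in enumerate(coefficients) if k - i >= 0))
    return out


class TwoBeamformerArrangement:
    """Hypothetical holder of the adaptive (4) and filtering (5) weights."""

    def __init__(self, channels, taps):
        self.adaptive = [[0.0] * taps for _ in range(channels)]
        self.filtering = [[0.0] * taps for _ in range(channels)]
        self.speaker_active = False  # e.g. driven by a push-to-talk button

    def accept_adapted_weights(self, weights):
        """Store newly adapted weights and copy them to the filtering
        beamformer; ignored while the speaker is active."""
        if self.speaker_active:
            return False  # adaptation switched off
        self.adaptive = copy.deepcopy(weights)
        self.filtering = copy.deepcopy(weights)
        return True

    def filter_target(self, target_noise_channels):
        """Filter the target-noise signals with the fixed, copied weights."""
        filtered = [fir(ch, w)
                    for ch, w in zip(target_noise_channels, self.filtering)]
        return [sum(values) for values in zip(*filtered)]
```

Keeping two separate weight sets means that the filtering path is
never disturbed by an adaptation step that is still in progress.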

The different components of the microphone arrangement can be of
any desired kind. A number of different known microphone types can
be used. Different filters can also be used, of which so-called
FIR filters merely constitute one example. The storage can also be
chosen in any appropriate way. The sampling frequency may likewise
take a number of different values.
The invention may also be varied in a number of other aspects and
in a number of different ways, being limited merely by the scope
of the claims.




WO 95/34983 



PCT/SE95/00718 



CLAIMS 

1. Adaptive microphone arrangement with at least one microphone
(MP1, MP2, ..., MPn), wherein the arrangement comprises a signal
detecting arrangement for detecting target input signals, a signal
forming arrangement and signal storing means,
characterized in that,

the input signals comprise a calibration signal (m1, ..., mn), a
noise signal (N1, ..., Nn) and a target-noise signal (sn1, ...,
snn), wherein the calibration input signal (m1, ..., mn) is
recorded and stored in the storing means (2), and in that the
signal forming arrangement comprises a first signal forming means
(4) and a second signal forming means (5), wherein the first
signal forming means (4) comprises adapting means treating the sum
of the calibration signal (m1, ..., mn) from the storage (2) and
the noise signal (N1, ..., Nn), thereby providing filtering
coefficients, and in that the filtering coefficients obtained from
the first signal forming means (4) are used in the second signal
forming means (5) on the target input signal, and in that the
calibration and noise signals and the target-noise signals are
input under essentially the same conditions.

2. Arrangement according to claim 1,
characterized in that,

the adapting means (4) is an adaptive beamformer.

3. Arrangement according to claim 2,
characterized in that,

the second signal forming means (5) is a filtering beamformer.

4. Arrangement according to any one of the preceding claims,
characterized in that,

the noise signal (N1, ..., Nn) is a pure noise signal.



5. Arrangement according to any one of claims 1-4,
characterized in that,

the calibration signal (m1, ..., mn) is a speech signal.

6. Arrangement according to claim 5,
characterized in that,

the calibration signal (m1, ..., mn) is a typical speech signal
or a speech-spectrum-influenced signal.

7. Arrangement according to any one of the preceding claims,
characterized in that,

the calibration signal (m1, ..., mn) is recorded on site.

8. Arrangement according to any one of the preceding claims,
characterized in that,

the storage (2) comprises at least one digital storage.

9. Arrangement according to claim 8,
characterized in that,

the input calibration signal comprises a number of calibration
signals (m1, ..., mn) which are combined into a desired signal
(mr).

10. Arrangement according to any one of the preceding claims,
characterized in that,

the adapting means (4) uses an adaptive algorithm (LMS, RLS)
for the adaptation.

11. Arrangement according to claim 10,
characterized in that,

the adaptive algorithm is the so-called LMS algorithm or some
other gradient algorithm or similar.

12. Arrangement according to claim 11,
characterized in that,

one of the calibration signals (M1, ..., Mn) or a combination
thereof is used as a so-called desired signal in the adaptive
algorithm.

13. Arrangement according to any one of the preceding claims,
characterized in that,

an essentially pure noise signal (N1, ..., Nn) is combined with
the essentially pure calibration signal (m1, ..., mn) in the
first adaptive beamformer (4), wherein the adaption coefficients
are so adapted that the output signal from the adapting
beamformer (4) is similar to a combination of the calibration
signals.

14. Arrangement according to any one of the preceding claims,
characterized in that,

the filtering coefficients obtained from the adaptation in the
adaptive beamformer (4) are copied to, and used in, the filtering
beamformer (5).

15. Arrangement according to any one of the preceding claims,
characterized in that,

the adaptation by the adaptive beamformer (4) is switched off
when a speaker or similar is active, i.e. upon input of the
target-noise signal.

16. Arrangement according to any one of the preceding claims,
characterized in that,

the target signal or the speech signal merely is filtered
through the filtering beamformer (5).

17. Arrangement according to any one of the preceding claims,
characterized in that,

it comprises one single microphone (MP1).

18. Arrangement according to any one of claims 1-16,
characterized in that,

it comprises an array of microphones (MP1, ..., MPn).

19. Arrangement according to any one of the preceding claims,
characterized in that,

the first and second beamformers (4, 5) comprise filters such
as e.g. FIR filters (Finite Impulse Response filters) and in
that the adaptive coefficients thereof are optimized adaptively
to the actual noise level or noise situation and to the equipment
on site.

20. Method for adapting to an incoming target-noise signal
comprising the steps of:

- recording at least one calibration signal coming essentially
from the same physical position and meeting essentially the
same equipment as a target-noise signal,

- storing the calibration signal(s) in the digital storage,

- providing an adaptive beamformer with the sum of a pure
noise signal and the calibration signal(s) and one of or
a combination of the calibration signals as desired
signal for the adapting process,

- carrying out an adaption in the adaptive beamformer
providing adapting coefficients such that the adaptive
beamformer suppresses surrounding noise and listens to a
calibration signal,

- copying the adapting coefficients from the adaptive
beamformer to a filtering beamformer,

- providing the filtering beamformer with a target-noise
signal during the provision of which the adaptation is
switched off,

- providing an output signal from the filtering beamformer.

21. Method according to claim 20 wherein the filtering beamformer
works without a calibration signal in a continuous manner.